More Risk-Sensitive Markov Decision Processes

Authors

  • Nicole Bäuerle
  • Ulrich Rieder
Abstract

We investigate the problem of minimizing a certainty equivalent of the total or discounted cost generated by a Markov Decision Process (MDP), over both a finite and an infinite horizon. The certainty equivalent is defined by U^{-1}(E[U(Y)]), where U is an increasing function. In contrast to a risk-neutral decision maker, this optimization criterion takes the variability of the cost into account. It contains as a special case the classical risk-sensitive optimization criterion with an exponential utility. We show that this optimization problem can be solved by an ordinary MDP with an extended state space and give conditions under which an optimal policy exists. In the case of an infinite time horizon we show that the minimal discounted cost can be obtained by value iteration and can be characterized as the unique solution of a fixed point equation using a 'sandwich' argument. Interestingly, it turns out that in the case of a power utility the problem simplifies and is of similar complexity to the exponential utility case; it has, however, not been treated in the literature so far. We also establish the validity (and convergence) of the policy improvement method. A simple numerical example, namely the classical repeated casino game, is considered to illustrate the influence of the certainty equivalent and its parameters. Finally, the average cost problem is also investigated. Surprisingly, it turns out that under suitable recurrence conditions on the MDP, for a convex power utility U the minimal average cost does not depend on U and is equal to the risk-neutral average cost. This is in contrast to the classical risk-sensitive criterion with exponential utility.
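As a rough illustration of the criterion, the following Python sketch evaluates the certainty equivalent U^{-1}(E[U(Y)]) of a random cost for an exponential utility (recovering the classical risk-sensitive criterion (1/gamma) log E[exp(gamma Y)]) and for a power utility. The two-point cost distribution and parameter values are made up for illustration and are not taken from the paper's casino example.

```python
import numpy as np

# Minimal sketch of the certainty equivalent U^{-1}(E[U(Y)]) of a random cost Y,
# assuming a toy two-point cost distribution (not the paper's casino example).

def certainty_equivalent_exp(costs, probs, gamma):
    """Exponential utility U(y) = exp(gamma*y): CE = (1/gamma) * log E[exp(gamma*Y)]."""
    return np.log(np.dot(probs, np.exp(gamma * costs))) / gamma

def certainty_equivalent_power(costs, probs, p):
    """Power utility U(y) = y**p on nonnegative costs: CE = (E[Y**p])**(1/p)."""
    return np.dot(probs, costs ** p) ** (1.0 / p)

# Toy cost distribution: cost 0 with probability 0.6, cost 10 with probability 0.4.
costs = np.array([0.0, 10.0])
probs = np.array([0.6, 0.4])

print(f"risk-neutral expectation: {np.dot(probs, costs):.3f}")
for gamma in (0.1, 0.5, 1.0):   # larger gamma = more risk-averse with respect to cost
    print(f"exponential CE (gamma={gamma}): {certainty_equivalent_exp(costs, probs, gamma):.3f}")
for p in (1.0, 2.0, 4.0):       # p = 1 recovers the risk-neutral expectation
    print(f"power CE (p={p}): {certainty_equivalent_power(costs, probs, p):.3f}")
```

For risk-averse parameter choices (gamma > 0, or p > 1 on nonnegative costs) both certainty equivalents exceed the plain expectation, which is the penalty on cost variability that the abstract refers to.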

Similar articles

Risk-Sensitive and Average Optimality in Markov Decision Processes

Abstract. This contribution is devoted to risk-sensitive optimality criteria in finite state Markov Decision Processes. At first, we rederive necessary and sufficient conditions for average optimality of (classical) risk-neutral unichain models. This approach is then extended to the risk-sensitive case, i.e., when the expectation of the stream of one-stage costs (or rewards) generated by a Mark...

Analysis of a risk-sensitive control problem for hidden Markov chains

In this paper the risk-sensitive control of partially observed Markov decision processes is considered. The replacement problem is analyzed in this context, and the structure of risk-sensitive optimal controllers is given.

A Short Note on Combining Multiple Policies in Risk-Sensitive Exponential Average Reward Markov Decision Processes

This short note presents a method of combining multiple policies in a given policy set such that the resulting policy improves all policies in the set for risk-sensitive exponential average reward Markov decision processes (MDPs), extending the work of Howard and Matheson for the singleton policy set case. Some applications of the method in solving risk-sensitive MDPs are also discussed.

Risk-sensitive and minimax control of discrete-time, finite-state Markov decision processes

This paper analyzes a connection between risk-sensitive and minimax criteria for discrete-time, finite-state Markov Decision Processes (MDPs). We synthesize optimal policies with respect to both criteria, for both finite horizon and discounted infinite horizon problems. A generalized decision-making framework is introduced, which includes as special cases a number of approaches that have been consid...

Solving Risk-Sensitive POMDPs With and Without Cost Observations

Partially Observable Markov Decision Processes (POMDPs) are often used to model planning problems under uncertainty. The goal in Risk-Sensitive POMDPs (RS-POMDPs) is to find a policy that maximizes the probability that the cumulative cost is within some user-defined cost threshold. In this paper, unlike existing POMDP literature, we distinguish between the two cases of whether costs can or cann...

Risk-Sensitive Control of Markov Decision Processes

This paper introduces an algorithm to determine near-optimal control laws for Markov Decision Processes with a risk-sensitive criterion. Both the fully observed and the partially observed settings are considered, for finite and infinite horizon formulations. Dynamic programming equations are introduced which characterize the value function for the partially observed, infinite horizon, discounted c...

Journal:
  • Math. Oper. Res.

Volume 39, Issue –

Pages –

Publication year: 2014